
Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal

Neural Information Processing Systems

Figure 1: Problem instance where a perfect heuristic is not strictly optimally efficient with GBFS.

Figure 2: Problem instance where an optimally efficient heuristic does not exist for GBFS.
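The figures illustrate a key property of greedy best-first search: its expansion order depends only on the relative order of heuristic values, never on their magnitudes. The following minimal sketch (not taken from the paper; the graph and function names are illustrative) shows that GBFS produces the same expansions under any strictly increasing transform of the heuristic.

```python
import heapq

def gbfs(start, goal, successors, h):
    """Greedy best-first search: always expands the open-list state with
    the smallest heuristic value. Only the *ranking* induced by h matters,
    not its scale."""
    open_list = [(h(start), start)]
    came_from = {start: None}
    expanded = []
    while open_list:
        _, s = heapq.heappop(open_list)
        expanded.append(s)
        if s == goal:
            # Reconstruct the path back to the start.
            path = []
            while s is not None:
                path.append(s)
                s = came_from[s]
            return path[::-1], expanded
        for t in successors(s):
            if t not in came_from:
                came_from[t] = s
                heapq.heappush(open_list, (h(t), t))
    return None, expanded
```

On a toy graph, running GBFS with `h` and with the monotone transform `2*h + 1` yields identical expansion sequences, which is why a heuristic only needs to be a correct ranking to guide GBFS well.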


In imitation learning for planning, the parameters of heuristic functions are optimized against a set of solved problem instances. This work revisits the necessary and sufficient conditions under which a heuristic is strictly optimally efficient for forward search algorithms, mainly A* and greedy best-first search (GBFS), meaning the search expands only the states on the returned optimal path. It then proposes a family of ranking-based loss functions tailored to a given variant of the forward search algorithm. Furthermore, from a learning-theory point of view, it discusses why optimizing the cost-to-goal h* is unnecessarily difficult. The experimental comparison on a diverse set of problems unequivocally supports the derived theory.
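The contrast between regressing to the cost-to-goal and learning a ranking can be made concrete with a small sketch. This is not the paper's loss; it is an illustrative pairwise hinge formulation under the assumption that training supplies ordered state pairs (i, j) where state i should receive a strictly smaller heuristic value than state j.

```python
import numpy as np

def regression_loss(h_pred, h_star):
    """L2 regression to the cost-to-goal h*: penalizes any deviation
    from the exact values, even when the ordering is already correct."""
    return np.mean((h_pred - h_star) ** 2)

def pairwise_ranking_loss(h_pred, pairs, margin=1.0):
    """Hinge loss over ordered pairs (i, j) with the desired order
    h(i) < h(j): only violations of the relative order are penalized."""
    return np.mean([max(0.0, margin + h_pred[i] - h_pred[j])
                    for i, j in pairs])
```

A prediction that is a scaled-up version of h* preserves the ordering, so the ranking loss is zero while the regression loss is large; this is one way to see why fitting exact cost-to-goal values is a harder objective than the search algorithm actually requires.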

